3D Coding TAT - S-Buffer (Span Buffer Techniques)
After reading the few snippets of info about this method, it is perhaps one of the easier rendering methods to understand. The idea behind it is to build a list of horizontal line spans (or vertical ones) which can be quickly used to draw the screen in a continuous manner, drawing one line span after the other until the entire screen has been drawn. I believe that the XnGine used in Terminator Future Shock and Skynet used this technique (this is only a guess). With the usual depth-sorting and polygon drawing there is sometimes a large amount of pixel overdraw as more distant objects are partially (or entirely) hidden by closer ones. These days polygons have a heavy 'per-pixel' workload (texture mapping and shading), so the fewer of these overdrawn pixels the better. Even with the latest super Pentium powered Cray wanna-be, the video system on PCs, compared to the system memory, well, sucks BIG-TIME! If you are lucky it's ONLY a 50% reduction in performance, but it is normally far worse than this.
To give you an example, I tested my Pentium 233 using Display-Doctor to give an indication of performance. For system memory it gave 112 MB/sec, but on the video-system it gave only 15 MB/sec, which is 'slightly' slower, wouldn't ya say?
Imagine we have 3 polygons at various depths from our view-point.
e.g.
          aaaaaaaaa                  <-- most distant
    bbbbbbbb
        ccccccccccccccccccccc        <-- closest
                  |
             view point                 (plan view)
The resulting screen scan-line after the above 3 polygon spans would look something like this:
    bbbbccccccccccccccccccccc
                                        (front view)
You should note that all the pixels from polygon 'a' have been completely overdrawn by the b and c polygon line spans. So instead of drawing 3 horizontal line spans on the screen we can just draw 2 (one of which, span 'b', is clipped). The essence of the S-BUFFER algorithm is to build up a complete set of span-lists by clipping the coordinates of each span as it is inserted into the list. The order in which these spans are clipped and entered is unimportant because, wherever spans overlap, the hidden part is either replaced by a later, closer span or rejected because it lies entirely behind a previous one. Take this method and apply it to every horizontal line on your screen and you basically have the S-BUFFER technique. You should hopefully end up with a list of horizontal line spans which represents all the clipped polygon spans and WITHOUT any overdrawing of pixels.
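To make the insert-and-clip idea concrete, here is a minimal sketch in C of one scan-line's span list. It is only an illustration under some loud assumptions: every span has a single constant depth (real polygon spans would need a Z at each end), the list is a small fixed array, and all the names (Span, SpanLine, push_piece, sline_insert) are made up for this example rather than taken from any real engine.

```c
#include <assert.h>

#define MAX_SPANS 64            /* assumed limit per scan-line */

typedef struct {
    int x0, x1;   /* half-open pixel range [x0, x1) */
    int z;        /* ASSUMPTION: one constant depth per span, smaller = closer */
    int id;       /* which polygon the span came from */
} Span;

typedef struct {
    Span s[MAX_SPANS];  /* kept sorted by x0 and never overlapping */
    int  n;
} SpanLine;

/* Append a piece to a list, skipping empty pieces and merging with the
   previous piece when it belongs to the same polygon and touches it. */
static void push_piece(SpanLine *l, int x0, int x1, int z, int id)
{
    if (x0 >= x1) return;
    if (l->n > 0 && l->s[l->n - 1].id == id && l->s[l->n - 1].x1 == x0) {
        l->s[l->n - 1].x1 = x1;
        return;
    }
    assert(l->n < MAX_SPANS);
    l->s[l->n].x0 = x0;
    l->s[l->n].x1 = x1;
    l->s[l->n].z  = z;
    l->s[l->n].id = id;
    l->n++;
}

/* Insert a new span, clipping it against (or letting it replace)
   whatever is already stored for this scan-line. */
void sline_insert(SpanLine *line, Span in)
{
    SpanLine out;
    int cur = in.x0;   /* left edge of the part of 'in' not yet placed */
    int i;

    out.n = 0;
    for (i = 0; i < line->n; i++) {
        Span e = line->s[i];
        int lo, hi;

        if (e.x1 <= in.x0) {               /* entirely left of the new span */
            push_piece(&out, e.x0, e.x1, e.z, e.id);
            continue;
        }
        if (e.x0 >= in.x1) {               /* entirely right of the new span */
            push_piece(&out, cur, in.x1, in.z, in.id);
            cur = in.x1;
            push_piece(&out, e.x0, e.x1, e.z, e.id);
            continue;
        }
        lo = e.x0 > in.x0 ? e.x0 : in.x0;  /* overlapping range [lo, hi) */
        hi = e.x1 < in.x1 ? e.x1 : in.x1;
        push_piece(&out, cur, lo, in.z, in.id);     /* empty gap before e */
        if (in.z < e.z) {                           /* new span is closer */
            push_piece(&out, e.x0, lo, e.z, e.id);
            push_piece(&out, lo, hi, in.z, in.id);
            push_piece(&out, hi, e.x1, e.z, e.id);
        } else {                                    /* new span is hidden here */
            push_piece(&out, e.x0, e.x1, e.z, e.id);
        }
        cur = hi;
    }
    push_piece(&out, cur, in.x1, in.z, in.id);      /* whatever is left over */
    *line = out;
}
```

Feeding the three example spans into this sketch in any order (say a = 6..15, b = 0..8 and c = 4..25, with c closest) should leave just two entries in the list: the clipped 'b' followed by 'c'.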
A good side effect of this build-then-draw strategy is that it is possible to get away with only having a single, physical screen page (like VGA mode 13h). The old traditional method for sprites or polygons was to erase the entire screen and then draw the objects in their new positions. This could (and usually would) lead to a flickery display or 'tears' in objects as the video raster beam interrupts the drawing of them. For really good quality, NO-FLICKER animation two video pages are needed. One is displayed while the other, off-screen one is being built up. After this the pages are swapped over and the just-built page is displayed while the next is built. The S-BUFFER technique makes it possible to get away with a single screen page because nothing is being erased or partially drawn; everything is being overwritten in a top-to-bottom order. This might cause a slight glitch or tear because we are drawing to the physically displayed screen memory, but it does mean that for video modes with a single page drawing can occur very quickly. No doubt it will be slightly faster than building an off-screen buffer image and block copying it onto the physical screen page.
1: Initialisation
The S-Buffer needs to be set up before each render-cycle, so each horizontal line's list needs to be emptied. But instead of creating a lot of null lists, why not build a list of spans for a background 'horizon' bitmap? This way we don't have to clear the screen before drawing the spans. Also, only the visible parts of the background will be drawn, rather than drawing the entire bitmap and having some parts overdrawn by the horizontal line spans.
If these background spans are given a pseudo depth like all the other spans then this can be used as a detail level control. Simply move the background's Z distance and all the landscape and object polygons will be clipped up to it.
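As a rough illustration of this set-up step, the sketch below resets every scan-line to a single full-width background span with a pseudo depth. The 320x200 size matches mode 13h; the structure layout and the names (sbuffer, sbuffer_reset, BACKGROUND_ID) are assumptions made up for the example.

```c
#include <assert.h>

#define SCREEN_W      320       /* mode 13h sized screen */
#define SCREEN_H      200
#define MAX_SPANS     64        /* assumed limit per scan-line */
#define BACKGROUND_ID 0         /* hypothetical id for the horizon bitmap */

typedef struct { int x0, x1, z, id; } Span;       /* [x0, x1), smaller z = closer */
typedef struct { Span s[MAX_SPANS]; int n; } SpanLine;

static SpanLine sbuffer[SCREEN_H];

/* Start a render-cycle: each line holds one background span instead of
   an empty list. Anything further away than background_z will later be
   rejected by the normal depth clipping, so moving background_z closer
   acts as a crude detail level control. */
void sbuffer_reset(int background_z)
{
    int y;

    assert(background_z > 0);
    for (y = 0; y < SCREEN_H; y++) {
        sbuffer[y].s[0].x0 = 0;
        sbuffer[y].s[0].x1 = SCREEN_W;
        sbuffer[y].s[0].z  = background_z;
        sbuffer[y].s[0].id = BACKGROUND_ID;
        sbuffer[y].n = 1;
    }
}
```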
2: Time to Clip ?
There is a modest performance gain which can be found in the span clipping part when spans are entered in the span-lists. Imagine that we are drawing texture-filled polygons, so each horizontal span will have 3 coordinates for each end point (more if shading is performed).
Left end-point          Right end-point
----------------        -----------------
Screen (X)              Screen (X)
Texture (U,V)           Texture (U,V)
Now we already have a horizontal line span in our S-buffer and we need to clip and insert another one which partially covers the first.
e.g.
aaaaaaaaaaaaaa               <--- span 1
        bbbbbbbbbbbbb        <--- span 2
So we must clip the part of span 1 which is hidden behind span 2. This would require clipping of the right Screen (X) and Texture(U,V) coordinates. The Screen (X) is not a problem and can be done with a simple MOV instruction, but it's the clipping in texture-space to find the right clipped (U,V) coordinates which is the performance bottleneck. In the worst case 2 multiplies and 2 divides! And remember this is just ONE span on ONE line. You can imagine if lots more polygon lines are drawn there will be many instances when the texture (U,V) coordinates must be clipped.
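The cost being described can be seen in a small sketch of the right-hand clip. It assumes plain linear (affine) interpolation along the span and 16.16 fixed-point texture coordinates; the names (fixed, SpanEnd, clip_right) are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed;                       /* ASSUMPTION: 16.16 fixed point */
#define TO_FIXED(n) ((fixed)(n) * 65536)

typedef struct {
    int   x;      /* screen X */
    fixed u, v;   /* texture coordinates at that X */
} SpanEnd;

/* Pull the right end of a span in to new_x and find the texture
   coordinates at the new end. The X is a plain copy; the (U,V) pair
   costs 2 multiplies and 2 divides. */
void clip_right(const SpanEnd *left, SpanEnd *right, int new_x)
{
    int len  = right->x - left->x;   /* original span length   */
    int keep = new_x - left->x;      /* length after the clip  */

    assert(len > 0 && keep >= 0 && keep <= len);
    right->u = left->u + (fixed)(((int64_t)(right->u - left->u) * keep) / len);
    right->v = left->v + (fixed)(((int64_t)(right->v - left->v) * keep) / len);
    right->x = new_x;
}
```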
In the above description the texture (U,V) clipping was done at the same time as inserting the span into the S-Buffer, but we can also clip as we draw each horizontal line span onto the screen. Although some more work must be done during the drawing stage, it can save a great deal of time. For example, say we inserted another span, span 3, which completely covered both the 1 and 2 spans.
e.g.
    aaaaaaaaaaaaaa                 <--- span 1
            bbbbbbbbbbbbb          <--- span 2
cccccccccccccccccccccccccccccc     <--- span 3
Now both span 2 and the just-clipped span 1 are completely hidden by span 3, which means that the right clipping of span 1's texture coordinates (U,V) has been done for nothing. If we added more and more spans then it is possible that more and more previous spans will be totally or partly hidden. Let's face it, this is the main characteristic of having many, many overlapping line spans that the S-BUFFER algorithm relies on.
All we need to do is to clip the screen (X) coords of the span and keep a record of its original X coord. This way after ALL the horizontal line spans have been built up in the S-Buffer we can set about clipping and drawing each texture-filled line. This does add a little more work in the drawing stage but has saved more in the span insertion stage, and don't forget the savings when spans are totally hidden by closer ones. This means we only ever clip the texture (U,V) coordinates for visible or partly clipped spans and NOT for totally overdrawn ones. To perform this clipping at draw time we need to keep either the original (unclipped) screen X coordinates of the span or a clipping length (where 0 = no clipping) together with the normal S-Buffer information. This extra storage is only about 2 words per horizontal span.
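A minimal sketch of this clip-later idea follows. It keeps the original (unclipped) X values next to the clipped ones, which is one of the two storage options mentioned above. It again assumes affine interpolation and 16.16 fixed point, and the names (LazySpan, lazy_clip_x, lazy_setup) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fixed;                       /* ASSUMPTION: 16.16 fixed point */
#define TO_FIXED(n) ((fixed)(n) * 65536)

typedef struct {
    int   x0, x1;             /* visible range [x0, x1) after X clipping    */
    int   orig_x0, orig_x1;   /* unclipped range the texture coords belong to */
    fixed u0, v0;             /* texture coordinates at orig_x0 */
    fixed u1, v1;             /* texture coordinates at orig_x1 */
} LazySpan;

/* Insertion stage: only the screen X values are touched. */
void lazy_clip_x(LazySpan *s, int new_x0, int new_x1)
{
    if (new_x0 > s->x0) s->x0 = new_x0;
    if (new_x1 < s->x1) s->x1 = new_x1;
}

/* Drawing stage: work out the starting (U,V) and the per-pixel steps
   for the visible part only. Spans that were totally covered never
   reach this function, so nothing is wasted on them. */
void lazy_setup(const LazySpan *s, fixed *u, fixed *v, fixed *du, fixed *dv)
{
    int len  = s->orig_x1 - s->orig_x0;   /* unclipped length          */
    int skip = s->x0 - s->orig_x0;        /* pixels clipped off the left */

    assert(len > 0 && skip >= 0);
    *du = (s->u1 - s->u0) / len;
    *dv = (s->v1 - s->v0) / len;
    *u  = s->u0 + *du * skip;
    *v  = s->v0 + *dv * skip;
}
```

In this sketch the filler needs the per-pixel steps anyway, so the left clip only adds two multiplies and the right clip just shortens the pixel count.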
All in all I think that this is a good solution as it simplifies the insertion stage and adds a very small amount of extra work to the drawing stage, so it balances up the two. As most of the data handling is done in the insertion stage this should create a shorter and faster processing pipeline (and that can't be bad).
This later clip stage can also be used to quickly insert spans which cross the left and right screen edges. Once all of the inserting has been done it is highly probable that these partly off-screen spans have already been replaced by others in the S-BUFFER.
3: Fences, Water & Tints
The basic S-BUFFER (span buffer) rendering technique in its simplest form is a two stage process. The first stage builds up a list of horizontal (or vertical etc.) line spans. This list is a series of both screen and texture-space coordinates which represent the ends of each scan-converted polygon line. During this stage NO pixels (or texels) are drawn, only their coordinates are calculated and stored in the span buffer lists. Overlapping line regions are either clipped or replaced by later line-spans which are added to these lists. The second stage is the "filler" part. This takes the resulting span lists from stage one and fills between the end points with pixels.
Now this S-BUFFER method can be extended to include transparent or semi-transparent material such as water, windows or even force-fields. I have previously described how new line-spans are added into the span buffer lists and how it is possible that older line-spans can be overwritten or partially covered by newer ones. But instead of just losing the older pixels in a span why not use the new span to colour or shade them?
e.g.
      aaaaaaaaaaaaaa            <--- span 1
             bbbbbbbbbbbbb      <--- span 2
====================            <--- span 3
In the above diagram we have 3 spans where 1 is the oldest and span 3 is a semi-transparent span. Instead of removing/clipping span 1 and 2 we could shade or tint them using our new span 3. This would modify half of span 2 (the bbb... line) and all of the clipped part of span 1 (the aaa... line).
You don't have to stop at just shading or colouring; you could use this method for water or fencing effects where a number of texture bitmaps are overlaid. In the case of water this is normally a wrap-around scrolling texture bitmap which colours other spans blue and perhaps has a few frames of animated white specks on top.
Here is a short list of ideas for span modifiers:
1. Colour/Tint - just like coloured filters
(emergency lighting etc.)
2. Shade - reduces lighting amount
(like fog, mist)
3. Fences - overlay 2 or more textures
(water ripples, rain, static fxs)
4. Lensing - curved lens like effects
(magnifying glass)
5. Mirroring - flipping the X or Y axis
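The first two modifiers in the list could be sketched roughly like this. This is only an illustration: it assumes 24-bit RGB pixels (a palettised mode would more likely use lookup tables), it treats 'shade' as a fade towards black, and the names (Rgb, SpanMod, apply_mod, apply_mod_span) are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb;     /* ASSUMPTION: true-colour pixels */

typedef enum { MOD_TINT, MOD_SHADE } ModKind;

typedef struct {
    ModKind kind;
    Rgb     colour;   /* tint colour, ignored by MOD_SHADE */
    uint8_t amount;   /* 0 = no effect, 255 = full effect  */
} SpanMod;

/* Blend one channel from 'from' towards 'to' by amount/255. */
static uint8_t mix(uint8_t from, uint8_t to, uint8_t amount)
{
    return (uint8_t)((from * (255 - amount) + to * amount) / 255);
}

/* Modify a single, already-drawn pixel. */
Rgb apply_mod(Rgb p, const SpanMod *m)
{
    Rgb target = {0, 0, 0};                  /* MOD_SHADE fades to black */

    if (m->kind == MOD_TINT) target = m->colour;
    p.r = mix(p.r, target.r, m->amount);
    p.g = mix(p.g, target.g, m->amount);
    p.b = mix(p.b, target.b, m->amount);
    return p;
}

/* Run a modifier span over the pixels [x0, x1) of one scan-line. */
void apply_mod_span(Rgb *line, int x0, int x1, const SpanMod *m)
{
    int x;

    assert(x0 <= x1);
    for (x = x0; x < x1; x++)
        line[x] = apply_mod(line[x], m);
}
```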
4: The Speed Issue
The problem with using newer spans to modify older ones is the issue of speed. With each new span comes the task of processing all the older ones underneath it. Purely opaque spans clip or replace older spans, and so reduce the number of them, whereas these colour tinting/shading spans do not; they just modify the previous ones. In effect we need to revisit and update old spans to gain the span effects we want. If a highly complex scene is being rendered with small polygons then this can get quite slow.
One solution could be to create a separate S-BUFFER for the shading/tinting effect(s) rather than trying to combine both at the same time. This way the insertion process will have fewer checks because solid spans will be contained in their own S-BUFFER and the modifying shade/tint ones in their own.
5: The Right Direction ?
The good thing about the S-BUFFER technique is that the direction of the view-point is not important. It deals with hidden surfaces by clipping their texture and screen coordinates rather than overdrawing their pixels. In other words we can scan items in a random order without regard to depth sorting. We could start at the most distant item and place its spans into the S-BUFFER first, and as other spans are added they may be covered or not. At the end we have the correct line span lists which we can simply fill with pixels without overdraw. The same result is found by choosing the closest item and adding that to the S-BUFFER first, or in fact any item in any order; the final result is identical no matter what the order is.
But inserting spans into the S-BUFFER takes time, so the lower the number that we need to insert the better. If we begin with the closest item and add its spans then this should work out faster because new spans will be further away and are much more likely to be hidden, so they will never be inserted into the S-BUFFER lists. The other advantage of this view --> horizon order is that the span modifiers (colour, shade, water, fences etc.) will be processed with each new span added rather than going back and updating older spans.
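Here is a small sketch of why the closest-first order is cheaper, assuming the caller really does supply opaque items in front-to-back order. In that case no depth compare is needed at all: a new span can only fill the gaps that are still empty, and a span that finds no gap is rejected without touching the list contents. The names (CoverSpan, CoverLine, cover_insert) are made up for this example.

```c
#include <assert.h>

#define MAX_SPANS 64            /* assumed limit per scan-line */

typedef struct { int x0, x1, id; } CoverSpan;   /* [x0, x1), no Z needed */
typedef struct { CoverSpan s[MAX_SPANS]; int n; } CoverLine;

static void add_piece(CoverLine *l, int x0, int x1, int id)
{
    assert(l->n < MAX_SPANS);
    l->s[l->n].x0 = x0;
    l->s[l->n].x1 = x1;
    l->s[l->n].id = id;
    l->n++;
}

/* Insert a span that is known to be behind everything already stored.
   Only the uncovered gaps receive pieces of it. Returns the number of
   visible pixels (0 means the span was completely hidden). */
int cover_insert(CoverLine *line, CoverSpan in)
{
    CoverLine out;
    int cur = in.x0;      /* left edge of the part of 'in' not yet checked */
    int visible = 0;
    int i;

    out.n = 0;
    for (i = 0; i < line->n; i++) {
        CoverSpan e = line->s[i];

        if (cur < in.x1 && e.x0 > cur) {          /* a gap before e */
            int hi = e.x0 < in.x1 ? e.x0 : in.x1;
            add_piece(&out, cur, hi, in.id);
            visible += hi - cur;
        }
        add_piece(&out, e.x0, e.x1, e.id);        /* closer span is kept whole */
        if (e.x1 > cur) cur = e.x1;
    }
    if (cur < in.x1) {                            /* gap after the last span */
        add_piece(&out, cur, in.x1, in.id);
        visible += in.x1 - cur;
    }
    *line = out;
    return visible;
}
```

With the earlier three-polygon example entered closest first (c, then b, then a), span 'a' should come back with 0 visible pixels and the list still holds just the clipped 'b' and 'c'.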
At this moment in time I think that the 'INLET' method can, if not solve, greatly speed up this process. As it is already a view-2-horizon (slab-casting) method then some/all of the depth sorting problems have already been taken care of.